Empirical Performance-Model Driven Data Layout Optimization

Authors

Qingda Lu

Xiaoyang Gao

Sriram Krishnamoorthy

Gerald Baumgartner

J. Ramanujam

P. Sadayappan

Abstract

Empirical optimizers like ATLAS have been very effective in optimizing computational kernels in libraries. The best choice of parameters such as tile size and degree of loop unrolling is determined by executing different versions of the computation. In contrast, optimizing compilers use a model-driven approach to program transformation. While the model-driven approach of optimizing compilers is generally orders of magnitude faster than ATLAS-like library generators, its effectiveness can be limited by the accuracy of the performance models used. Very often, simplified abstractions such as reuse distance are used instead of more accurate metrics like cache miss cost, in order to make model-driven compiler optimization feasible. In this paper, we describe an approach in which a class of computations is modeled in terms of its constituent operations, which are empirically measured. The empirical measurement of the constituent operations allows modeling of the overall execution time of the computation. The performance model with empirically determined cost components is used to perform data layout optimization in the context of the Tensor Contraction Engine, a compiler for a high-level domain-specific language for expressing computational models in quantum chemistry. This may be viewed as an example of the telescoping language approach to optimizing a sequence of library calls using semantic information about the performance properties of the library routines. The effectiveness of the approach is demonstrated through experimental measurements on some representative computations from quantum chemistry.
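
As a rough illustration of the idea in the abstract (measure the constituent operations once, then predict the cost of a whole computation by combining those measurements), the following minimal Python sketch may help. It is not the paper's model: the operation names (`gemm_nn`, `gemm_tn`, `transpose`), the cost values, and the two candidate plans are hypothetical, and the additive cost model is an assumption made for the sake of the example.

```python
import time


def measure(op, repeats=3):
    """Return the best wall-clock time (seconds) over `repeats` calls of op()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        op()
        best = min(best, time.perf_counter() - start)
    return best


def predicted_time(plan, cost_table):
    """Assumed additive model: sum the measured costs of a plan's operations."""
    return sum(cost_table[op] for op in plan)


def choose_layout(candidates, cost_table):
    """Pick the candidate whose plan has the lowest predicted time."""
    return min(candidates, key=lambda name: predicted_time(candidates[name], cost_table))


# Hypothetical costs in seconds. In practice each entry would come from
# measure() applied to the real library routine at the relevant problem size.
cost_table = {
    "gemm_nn": 0.80,    # multiply with both operands in their preferred layout
    "gemm_tn": 1.10,    # multiply with one operand accessed in transposed form
    "transpose": 0.25,  # explicit re-layout of one operand
}

# Two hypothetical ways to evaluate the same computation.
candidates = {
    "keep_layout": ["gemm_tn"],
    "permute_then_multiply": ["transpose", "gemm_nn"],
}

best = choose_layout(candidates, cost_table)
print(best, predicted_time(candidates[best], cost_table))
```

With these made-up numbers, paying for an explicit re-layout is predicted to be cheaper than calling the slower multiply variant. The point is only that the choice is driven by measured component costs rather than by an abstract metric such as reuse distance.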

Similar resources

Empirical performance model-driven data layout optimization and library call selection for tensor contraction expressions

Automatic Data Layout Using 0-1 Integer Programming

In optimization today, even the number of processors used for parallel programming is carefully considered. The goal of high-level languages is to provide a simple yet efficient machine-independent parallel programming model. The programmer's data-parallel programs should compile and execute with good performance on many different architectures. However t...

Multi-level Data Layout Optimization for Heterogeneous Access Patterns

GONG, ZHENHUAN. Multi-level Data Layout Optimization for Heterogeneous Access Patterns. (Under the direction of Dr. Nagiza F. Samatova.) Recent years have seen an enormous increase in computation power of leadership computing facilities. As a result, huge amounts of data, from terascale to petascale, are being produced by scientific applications running on supercomputers. However, the I/O subsy...

Applications of two new algorithms of cuckoo optimization (CO) and forest optimization (FO) for solving single row facility layout problem (SRFLP)

Due to the inherent complexity of real optimization problems, developing solution algorithms for them has always been challenging. The single row facility layout problem (SRFLP) is an NP-hard problem of arranging a number of rectangular facilities of varying lengths on one side of a straight line, with the aim of minimizing the weighted sum of the distance between all facility...

A new layout-driven timing model for incremental layout optimization

In this paper we present a new layout-driven timing model based on Asymptotic Waveform Evaluation (AWE) for improved timing analysis during routing. Our model enables the bottom-up computation of interconnect tree moments, and can be easily integrated with such a global router. Such an integration achieves incremental layout optimization, i.e., timing analysis and routing are tightly coupled, w...

Publication date: 2004